New tool addition: amas tool #7443

jchchiu · 2025-11-06T18:29:00Z

FOR CONTRIBUTOR:

I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
License permits unrestricted use (educational + commercial)
This PR adds a new tool or tool collection
This PR updates an existing tool or tool collection
This PR does something else (explain below)

Regarding issue #7442

Implemented the amas alignment concatenation action needed for the Biohackathon workflow
Added a simple test case with corresponding expected outputs

The AMAS commands are:
  concat      Concatenate input alignments
  convert     Convert to other file format
  replicate   Create replicate data sets for phylogenetic jackknife
  split       Split alignment according to a partitions file
  summary     Write alignment summary
  remove      Remove taxa from alignment
  translate   Translate DNA alignment into protein alignment

Command flags can be seen here
Note: the tool's standard output for concat will say "Wrote concatenated sequences to fasta file 'concatenated.out'"; however, the output file is renamed according to the output format you have chosen (e.g. 'concatenated.fasta' for fasta, 'concatenated.nex' for nexus and nexus-int).

To do:

Add more test coverage
Will also try to implement replicate and split (need to work out how to handle the outputs); the others are already covered by existing Galaxy tools (unless there is something you think looks interesting to implement)
Clean up code and help text
Fix --check-align

For now, is it possible to review the code to see if it's on the right track, and if there are any better ways to structure it?

jchchiu · 2025-11-07T05:31:19Z

See updated changes from George at jchchiu#1

tools/amas/amas.xml

tools/amas/macros.xml

tools/amas/amas.xml

bernt-matthias · 2025-11-07T09:19:51Z

tools/amas/amas.xml

+        <param name="in_format" type="select" label="Format of the input file">
+            <option value="fasta">fasta</option>
+            <option value="phylip">phylip</option>
+            <option value="phylip-int">phylip-int</option>
+            <option value="nexus">nexus(sequential)</option>
+            <option value="nexus-int">nexus(interleaved)</option>
+        </param>


fasta, phylip and nexus can be distinguished automatically, e.g. $input_file.ext gives the Galaxy datatype. Is the interleaved/sequential info needed? Can it be determined automatically?

https://github.com/marekborowiec/AMAS/blob/2e93d31638625135aa48a68251c363ac23a47c4a/amas/AMAS.py#L692

It doesn't seem like they have a function that detects the interleaved format automatically; instead it depends on the input format you give it. Can Galaxy automatically distinguish interleaved files?
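To make the question concrete, here is a minimal Python sketch of the mapping being discussed: the Galaxy datatype extension gives the base format, and the interleaved/sequential distinction has to come from somewhere else. The helper name `amas_in_format` and the extension keys (`nex` in particular) are assumptions for illustration; the returned values are the format names from the select options above.

```python
# Assumed Galaxy datatype extensions mapped to AMAS base format names.
# The keys are illustrative; check them against the Galaxy datatypes in use.
EXT_TO_AMAS = {
    "fasta": "fasta",
    "phylip": "phylip",
    "nex": "nexus",
    "nexus": "nexus",
}


def amas_in_format(galaxy_ext, interleaved=False):
    """Return an AMAS input format name for a Galaxy extension.

    The extension alone cannot say whether a file is interleaved,
    so that information is passed in separately.
    """
    base = EXT_TO_AMAS.get(galaxy_ext)
    if base is None:
        raise ValueError(f"unsupported datatype extension: {galaxy_ext}")
    if not interleaved:
        return base
    if base == "fasta":
        # The select options above list no interleaved fasta variant.
        raise ValueError("no interleaved variant for fasta")
    return f"{base}-int"
```

For example, `amas_in_format("phylip", interleaved=True)` gives `phylip-int`, while the extension on its own can only ever yield `phylip`.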

tools/amas/amas.xml

bernt-matthias · 2025-11-07T09:21:43Z

tools/amas/amas.xml

+        </collection>
+
+        <collection name="converted_alignments" type="list" label="Converted alignments">
+            <discover_datasets directory="run_dir/convert" pattern="(?P&lt;name&gt;.+)-out\..+" format="data" />


We should set the format instead of using format="data".

Could you have a look at the amas_split.xml at L55; is this what you were thinking?
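For reference, a small Python sketch of what the discovery pattern in the snippet above captures. The pattern is the one from the `discover_datasets` element with the XML entities unescaped; the file names are hypothetical, and using `re.match` here is an assumption about how the pattern is applied, not a statement about Galaxy internals.

```python
import re

# Pattern from the discover_datasets element, with &lt; / &gt; unescaped.
PATTERN = re.compile(r"(?P<name>.+)-out\..+")


def element_name(filename):
    """Return the collection element name captured by the pattern, or None."""
    match = PATTERN.match(filename)
    return match.group("name") if match else None


# Hypothetical output file names, for illustration only.
for candidate in ("locus1.fas-out.phy", "locus2.fas-out.nex", "partitions.txt"):
    print(candidate, "->", element_name(candidate))
```

The pattern only recovers the element name; everything after `-out.` is matched but discarded, which is why the datatype still has to be set some other way than `format="data"`.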

jchchiu · 2025-11-11T00:42:47Z

Hey @bernt-matthias, could you have a look at amas_concat.xml and see if this is on the right track? If so, I'll update the rest of the subcommands with your suggestions.
Cheers for all the thorough comments.

tools/amas/amas.xml

tools/amas/macros.xml

tools/amas/.shed.yml

tools/amas/amas_concat.xml

bernt-matthias · 2025-11-12T09:12:03Z

tools/amas/amas_concat.xml

+            concat
+            --concat-part partitions.txt
+            --concat-out concatenated.out
+            --part-format $part_format


You can determine the input format from $input_files.ext.

Comment also relevant to #7443 (comment)

The problem I have with this is that for a nexus or phylip file, the extension doesn't always tell you whether it is in interleaved or sequential format. Even if it is sniffed as interleaved, does $input_files.ext return phylip-int or something similar that differentiates it from normal phylip? Otherwise I'm pretty sure amas needs the user to explicitly set the file format as an input.

What are your thoughts on only accepting non-interleaved formats and warning the user in the help that interleaved files are not accepted? Following this, we would also remove the option to output an interleaved file. The problem I see is that users can still upload an interleaved file, since it has the same extension.

I would suggest leaving it for now and working on an extension of the datatypes. Do you think the interleaved / sequential datatypes are well defined? Are you interested in working on this? I could try to give you some pointers.

Alternatively we could also implement a small helper script that checks if the data is interleaved. Seems rather trivial.
For the output my suggestion would be a boolean interleaved: yes/no. Plus a select: as input / phylip / nexus / fasta?
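A rough Python sketch of such a helper for PHYLIP input, as a starting point only. It assumes the relaxed layout where a sequential file has exactly one line per taxon and an interleaved file repeats blocks of one line per taxon; sequential files that wrap a sequence over several lines would be misclassified. The function name `phylip_layout` is hypothetical.

```python
def phylip_layout(text):
    """Guess whether PHYLIP text is 'sequential' or 'interleaved'.

    Returns 'unknown' when the header or line counts do not fit
    either layout under the one-line-per-taxon assumption.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "unknown"
    header = lines[0].split()
    if len(header) < 2 or not (header[0].isdigit() and header[1].isdigit()):
        return "unknown"
    ntax, nchar = int(header[0]), int(header[1])
    body = lines[1:]
    if ntax == 0 or len(body) < ntax:
        return "unknown"
    # Residues on the first taxon line, ignoring the name and any spacing.
    first_len = len("".join(body[0].split()[1:]))
    if len(body) == ntax and first_len == nchar:
        return "sequential"
    if len(body) > ntax and len(body) % ntax == 0 and first_len < nchar:
        return "interleaved"
    return "unknown"
```

The result could then feed the boolean suggested above, or select between `phylip` and `phylip-int` on the command line.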

bernt-matthias · 2025-11-12T09:16:52Z

tools/amas/amas_concat.xml

+               help="A file defining how the concatenated alignment is split into separate gene/locus regions. Each line specifies a partition name and its position range (e.g., 'gene1 = 1-500' or 'DNA, gene1 = 1-500' for RAxML format).">
+            <option value="nexus">nexus</option>
+            <option value="raxml">raxml</option>
+            <option value="unspecified" selected="true">unspecified</option>


What happens in the unspecified case?

It just has the genes and their start and end; should I add more context or just direct them to the help section?

Unspecified: gene1 = 1-500

RAxML: DNA, gene1 = 1-500

NEXUS:

#NEXUS Begin sets; charset gene1 = 1-500; End;
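The three layouts can be sketched in Python as follows. This only reproduces the line formats listed in this comment; the function name `format_partitions`, the fixed `DNA` prefix, and the line breaks in the NEXUS block are assumptions, and the files AMAS actually writes may differ in whitespace.

```python
def format_partitions(partitions, style="unspecified"):
    """Render (name, start, end) tuples in one of the three partition styles."""
    body = [f"{name} = {start}-{end}" for name, start, end in partitions]
    if style == "unspecified":
        return "\n".join(body)
    if style == "raxml":
        # RAxML style prefixes each line with the data type.
        return "\n".join(f"DNA, {line}" for line in body)
    if style == "nexus":
        charsets = [f"charset {line};" for line in body]
        return "\n".join(["#NEXUS", "Begin sets;", *charsets, "End;"])
    raise ValueError(f"unknown partition style: {style}")


genes = [("gene1", 1, 500), ("gene2", 501, 1200)]
for style in ("unspecified", "raxml", "nexus"):
    print(format_partitions(genes, style), end="\n\n")
```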

tools/amas/amas_remove.xml

tools/amas/amas_replicate.xml

tools/amas/amas_split.xml

tools/amas/macros.xml

tools/amas/amas_remove.xml

tools/amas/amas_concat.xml

tools/amas/macros.xml

tools/amas/amas_summary.xml

tools/amas/macros.xml

jchchiu · 2025-11-19T06:44:36Z

I've been testing the split subcommand again and it seems like AMAS doesn't work when you use a RAxML or NEXUS formatted partitions file as an input.

The regex only matches partitions in the unspecified format:
matches = re.finditer(r"^(\s+)?([^ =]+)[ =]+([\0-9, -]+)", self.in_file_lines, re.MULTILINE)

I've updated the subcommand accordingly with some more info.
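A quick way to reproduce this outside AMAS is the Python sketch below. The pattern is copied verbatim from the line quoted above, and the sample inputs are the three partition layouts listed earlier in the thread; the helper name `parse_partitions` is hypothetical and only wraps `re.finditer` the same way the quoted line does.

```python
import re

# Pattern copied verbatim from the quoted line.
PATTERN = r"^(\s+)?([^ =]+)[ =]+([\0-9, -]+)"


def parse_partitions(text):
    """Return (name, range) pairs found by the quoted pattern."""
    return [
        (match.group(2), match.group(3))
        for match in re.finditer(PATTERN, text, re.MULTILINE)
    ]


samples = {
    "unspecified": "gene1 = 1-500",
    "raxml": "DNA, gene1 = 1-500",
    "nexus": "#NEXUS\nBegin sets;\ncharset gene1 = 1-500;\nEnd;",
}
for style, text in samples.items():
    print(style, parse_partitions(text))
```

With these samples only the unspecified line produces a match; in the RAxML and NEXUS layouts the first token on each line is followed by text that the range group cannot consume, so nothing is captured.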

jchchiu added 6 commits November 7, 2025 05:04

feat: add amas and macros

917c7aa

test: add simple working test case

bb629f6

fix: change category to Multiple Alignments

be4db1f

fix: change category to Sequence Analysis

932783d

update from george

793a6b1

update from george; add tests

af75ec9

jchchiu added 4 commits November 7, 2025 16:35

update from george; add info.xml

a45c8b5

fix lint

e816d9c

add split test; update .shed; add comment to xml command

9967a62

update .shed owners

8e937d7

bernt-matthias reviewed Nov 7, 2025

jchchiu added 5 commits November 7, 2025 22:28

remove translate

a6ff62e

docs: update .shed

c354605

refactor: split concat into separate tool

a4fc62f

refactor: add input and output format as shared macro

6a56045

refactor: add macro for changing output format

426a577

jchchiu added 8 commits November 11, 2025 17:38

refactor: move info to macros

c757008

refactor: change tool id/name; remove info macro

1509d85

docs: update categories; reduce actions

6872743

refactor: rename output format

c77e246

refactor: move 'split' subcommand into separate tool

582d254

refactor: change output pattern

bc9bebd

refactor: move 'replicate' subcommand into separate tool

dc15ac1

docs: add more help to explain what partitions are

a279552

SaimMomin12 reviewed Nov 11, 2025

SaimMomin12 changed the title ~~Add amas(1.0) tool~~ New tool addition: amas tool Nov 11, 2025

jchchiu added 2 commits November 12, 2025 11:10

refactor: move 'summary' subcommand into separate tool

1d901f5

temp: move 'remove' subcommand into separate tool

77241c3

docs: rename output label so that it is more user friendly

e2de21b

jchchiu requested review from SaimMomin12 and bernt-matthias November 12, 2025 06:01

bernt-matthias reviewed Nov 12, 2025

jchchiu added 10 commits November 17, 2025 12:45

docs: add auto_tool_repositories and suite to shed.yml

b3c6135

refactor: run everything in ./; added ftype to tests

4b43895

refactor: changed check_align and data_type to macro

f6a85d5

refactor: moved shared commands to macro tokens

08bd74d

refactor/docs: moved shared help to macro token

b19d9b7

refactor: added ${tool.name} on ${on_string} to output labels

b61f0ff

docs: updated file format formatting to be more consistent

846b254

style: removed single quotes

eec0620

docs: updated docs to include info on sequential vs interleaved; fixed some formatting

8364cfe

docs: moved partitions help to macro token

834f114

jchchiu requested a review from bernt-matthias November 17, 2025 04:25

bernt-matthias reviewed Nov 18, 2025

jchchiu added 9 commits November 19, 2025 12:21

refactor: set format depending on part_format

2d2349b

style: changed formatting of output files

0e62561

fix: updated version command

4af9562

tests: changed concat test from sim size to exact

cfcfca9

refactor: simplified change_format

d4b84ac

fix: updated/fixed concat test

51bb36e

fix: added nex format to allowed inputs for partitions

ff762fb

docs: updated help

3d9424b

style: fix lint

18a8396

jchchiu added 3 commits November 19, 2025 17:45

fix: split subcommand does not work with RAxML or NEXUS formatted partitions; removed with note and more info

bd9a818

docs: added some comments for future

0aae4cb

style: cleaned up indenting

96395ca

jchchiu requested a review from bernt-matthias November 20, 2025 03:50
